ABSTRACT

We develop a procedure for calculating a $k \times n$ rank $k$ matrix $B$
for data compression using the Bhattacharyya bound on the probability
of error and an iterative construction using Householder
transformations. Two sets of remotely sensed agricultural data
are used to demonstrate the application of the procedure. The
results of the applications give some indication of the extent to
which the Bhattacharyya bound on the probability of error is affected
by such transformations for multivariate normal populations.

1. INTRODUCTION

For n-dimensional normal classes $N(\mu_i, \Sigma_i)$, $i = 1,\dots,m$, the
Bhattacharyya coefficient (Andrews, 1972) for classes $i$ and $j$ is
given by:

$$\rho(i,j) = (q_i q_j)^{1/2} \int_{R^n} \{p_i(x)\,p_j(x)\}^{1/2}\,dx$$


and the Bayes probability of error (Anderson, 1958) (Andrews, 1972) is

$$P_e = 1 - \int_{R^n} \max_{1 \le i \le m} \{q_i\,p_i(x)\}\,dx$$

where $p_i(x)$ denotes the conditional density of the random variable
$X$ given that $X \sim N(\mu_i, \Sigma_i)$, and $q_1,\dots,q_m$, respectively,
denote the (known) a priori probabilities of the classes $N(\mu_i, \Sigma_i)$,
$i = 1,\dots,m$.

It has been shown (Andrews, 1972) (Kailath, 1967) that

$$P_e \le \rho \equiv \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} \rho(i,j).$$
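As a quick numerical sanity check of the bound above, the following sketch compares the Bhattacharyya coefficient and the Bayes error for two univariate normal classes. The class parameters, the integration limits, and the helper names (`normal_pdf`, `integrate`, `rho_closed_form`) are illustrative assumptions, not values or routines from the paper; the closed form used is the one-dimensional special case of the normal-population expression.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def integrate(func, lo, hi, steps=20000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for s in range(1, steps):
        total += func(lo + s * h)
    return total * h

def rho_closed_form(q1, m1, v1, q2, m2, v2):
    """(q1*q2)^(1/2) times the integral of sqrt(p1*p2) for two univariate normals."""
    f = -0.25 * (m1 - m2) ** 2 / (v1 + v2)
    g = -0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return math.sqrt(q1 * q2) * math.exp(f + g)

# Hypothetical two-class example (values chosen only for illustration).
q1, m1, v1 = 0.5, 0.0, 1.0
q2, m2, v2 = 0.5, 2.0, 1.5
LO, HI = -15.0, 17.0  # both densities are negligible outside this interval

rho_numeric = math.sqrt(q1 * q2) * integrate(
    lambda x: math.sqrt(normal_pdf(x, m1, v1) * normal_pdf(x, m2, v2)), LO, HI)
rho_exact = rho_closed_form(q1, m1, v1, q2, m2, v2)
bayes_error = 1.0 - integrate(
    lambda x: max(q1 * normal_pdf(x, m1, v1), q2 * normal_pdf(x, m2, v2)), LO, HI)

print("rho (numeric):", rho_numeric)
print("rho (closed form):", rho_exact)
print("Bayes error:", bayes_error)
```

With two classes the double sum reduces to the single term for the pair (1,2), so the printed Bayes error should not exceed the printed coefficient.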

If one considers a $k \times n$ rank $k$ linear transformation $B$ of the random
variable $X$ (i.e., $Y = BX$), then the Bhattacharyya coefficient
for classes $i$ and $j$ for the classes $N(B\mu_i, B\Sigma_i B^T)$, $i = 1,\dots,m$, is:

$$\rho_B(i,j) = (q_i q_j)^{1/2} \int_{R^k} \{p_i(y,B)\,p_j(y,B)\}^{1/2}\,dy$$

and the Bayes probability of error for the classes $N(B\mu_i, B\Sigma_i B^T)$,
$i = 1,\dots,m$, is:

$$P_e(B) = 1 - \int_{R^k} \max_{1 \le i \le m} \{q_i\,p_i(y,B)\}\,dy$$


where $p_i(y,B)$, $i = 1,\dots,m$, denotes the conditional density of the
random variable $Y = BX$ given that $Y \sim N(B\mu_i, B\Sigma_i B^T)$. It follows,
since $P_e \le \rho = \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} \rho(i,j)$, that

$$P_e(B) \le \rho(B) \equiv \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} \rho_B(i,j)$$

and moreover (Decell and Quirein, 1973) (Kailath, 1967), that

(1) $P_e \le P_e(B) \le \rho(B)$.

(2) $P_e = P_e(B)$ if and only if $\rho = \rho(B)$.
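To make the relation between $\rho$ and $\rho(B)$ concrete, here is a minimal sketch for two bivariate normal classes with diagonal covariance matrices, where the full coefficient factors into a product of one-dimensional terms. The class parameters and the helper `bc_1d` are hypothetical; $B$ is taken to be a coordinate selector (keep the first or the second coordinate), which is only one simple instance of a rank $k$ transformation.

```python
import math

def bc_1d(m1, v1, m2, v2):
    """Integral of sqrt(p1*p2) for two univariate normal densities."""
    scale = math.sqrt(2.0 * math.sqrt(v1 * v2) / (v1 + v2))
    return scale * math.exp(-0.25 * (m1 - m2) ** 2 / (v1 + v2))

# Hypothetical classes: means and per-coordinate variances (diagonal covariances).
mu1, var1 = [0.0, 0.0], [1.0, 1.0]
mu2, var2 = [1.0, 2.0], [2.0, 0.5]
q1 = q2 = 0.5
prior = math.sqrt(q1 * q2)

# Independent coordinates: the n-dimensional integral is a product of 1-D integrals.
per_coordinate = [bc_1d(mu1[d], var1[d], mu2[d], var2[d]) for d in range(2)]
rho_full = prior * per_coordinate[0] * per_coordinate[1]

# B = (1 0) keeps coordinate 0; B = (0 1) keeps coordinate 1.
rho_keep_first = prior * per_coordinate[0]
rho_keep_second = prior * per_coordinate[1]

print("rho, full data:", rho_full)
print("rho(B), first coordinate only:", rho_keep_first)
print("rho(B), second coordinate only:", rho_keep_second)
```

Each one-dimensional factor is at most 1, so neither compressed value can fall below the full-data value, and the two choices of $B$ give different bounds. That difference is what the search over $B$ tries to exploit.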
2. THEORETICAL PRELIMINARIES

Let $k$ be an integer ($0 < k < n$), and let $N(\mu_i, \Sigma_i)$, $i = 1,\dots,m$,
be n-variate normal populations with a priori probabilities
$q_1,\dots,q_m$. We would like to construct a $k \times n$ rank $k$ matrix $B$ that
will minimize $\rho(B)$. The theoretical extent to which this is possible
and the basis for the construction (Decell and Smiley, to
appear) are summarized in the following theorem. Let
$C = \{u \in R^n : \|u\| = 1\}$ and $T(H) = \{H = I - 2uu^T : u \in C\}$ denote the
set of Householder transformations on $R^n$ (Householder, 1958).
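The defining properties of a Householder transformation can be checked directly. The sketch below builds $H = I - 2uu^T/(u^Tu)$ for an arbitrary nonzero vector (dividing by $u^Tu$ so that $u$ need not be pre-normalized) and verifies that $H$ is symmetric, is its own inverse, and reflects $u$ to $-u$. The vector and the helper names are illustrative assumptions.

```python
def householder(u):
    """Return H = I - 2 u u^T / (u^T u) as a list of rows."""
    norm_sq = sum(c * c for c in u)
    n = len(u)
    return [[(1.0 if r == c else 0.0) - 2.0 * u[r] * u[c] / norm_sq
             for c in range(n)] for r in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[r][t] * b[t][c] for t in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

u = [1.0, 2.0, 2.0]  # hypothetical vector; not unit length on purpose
H = householder(u)
HH = matmul(H, H)
Hu = [sum(H[r][c] * u[c] for c in range(3)) for r in range(3)]

for row in H:
    print(row)
print("H applied to u:", Hu)
```

Because $H$ is symmetric and orthogonal, products such as $H\Sigma H$ that appear later are genuine similarity transformations of the covariance matrices.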
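Theorem. For each positive integer $i$, let $H_i \in T(H)$ be chosen such that

$$\rho((I_k|Z)H_1) = \underset{H \in T(H)}{\mathrm{g.l.b.}}\ \rho((I_k|Z)H)$$

and

$$\rho((I_k|Z)H_{i+1}H_i \cdots H_1) = \underset{H \in T(H)}{\mathrm{g.l.b.}}\ \rho((I_k|Z)HH_i \cdots H_1);$$

then,

(1) $\rho((I_k|Z)H_{i+1}H_i \cdots H_1) \le \rho((I_k|Z)H_i \cdots H_1)$.

(2) $\rho((I_k|Z)H_{i+1} \cdots H_1) \le \rho((I_k|Z)H_i \cdots H_1H)$, $H \in T(H)$.

(3) $\rho((I_k|Z)H_{i+1}H_i \cdots H_1) \le \rho((I_k|Z)HH_i \cdots H_1)$, $H \in T(H)$.

(4) [inequality illegible in the source] ... and $p = 0,\dots,i-2$.

(5) The monotone sequence of real numbers $\{\rho(B_i)\}_{i=1}^{\infty}$, where
$B_i = (I_k|Z)H_i \cdots H_1$, is bounded below by $P_e$ and hence

$$\lim_{i \to \infty} \rho(B_i) = \underset{i}{\mathrm{g.l.b.}}\ \rho(B_i).$$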


We know (Decell and Quirein, 1973) that there is some $k \times n$ rank
$k$ matrix, say $B$, that minimizes $\rho(B)$. If $\rho(B) < \mathrm{g.l.b.}\{\rho(B_i)\}$
we will call the sequence $\{B_i\}$ suboptimal (optimal in the
case of equality). There are several results (Decell and Smiley,
to appear) that lend credibility to the conjecture that the sequence
is optimal and cofinally constant beyond the index
$i = \min\{k, n-k\}$. We will proceed with the development of an iterative
procedure for constructing the subject sequence and, finally,
tabulate results of applications to remotely sensed agricultural
data with equal a priori class probabilities. The approach (and
its merit) will depend upon the bound provided by the inequality
$P_e \le \rho(B_i)$, $i = 1,2,\dots$, the non-increasing nature of the sequence
$\{\rho(B_i)\}_{i=1}^{\infty}$, and the ability to manipulate the expressions for
$\rho(B_i)$, $i = 1,2,\dots$, in the case of normal populations.


3. THE GRADIENT OF $\rho((I_k|Z)H)$

We will develop an expression (for the case of normal n-variate
populations $N(\mu_i, \Sigma_i)$, $i = 1,\dots,m$) for the gradient of
$\rho((I_k|Z)H)$, where $H \in T(H)$ has the form

$$H = I - 2\,\frac{xx^T}{x^Tx}, \qquad x \ne \theta \ \text{(the zero vector)}.$$

This expression will be used in a steepest descent procedure to
calculate each Householder transformation $H_1, H_2, \dots$ described
in the preceding theorem. For $m$ populations $N(\mu_i, \Sigma_i)$,
$i = 1,\dots,m$, it is easy to establish that in order to calculate
$H_{r+1}$ one need only apply the steepest descent procedure to the
Bhattacharyya coefficient determined by the populations
$N(H_r \cdots H_1\mu_j,\ H_r \cdots H_1\Sigma_jH_1 \cdots H_r)$, $j = 1,\dots,m$.




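The expression for $\rho_{(I_k|Z)H}(i,j)$ is given by (Andrews, 1972)
(Kailath, 1967) (for the case of equal a priori probabilities
$q_i = 1/m$, $i = 1,\dots,m$):

$$\rho_{(I_k|Z)H}(i,j) = \frac{1}{m} \exp\left\{ -\frac{1}{4}\,\hat\delta_{ij}^T (\hat\Sigma_i + \hat\Sigma_j)^{-1} \hat\delta_{ij} - \frac{1}{2} \ln \frac{\left|\tfrac{1}{2}(\hat\Sigma_i + \hat\Sigma_j)\right|}{|\hat\Sigma_i|^{1/2}\,|\hat\Sigma_j|^{1/2}} \right\}$$

where $\hat\delta_{ij} = (I_k|Z)H(\mu_i - \mu_j)$ and $\hat\Sigma_i = (I_k|Z)H\Sigma_iH(I_k|Z)^T$, in which
case,

$$\rho((I_k|Z)H) = \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} \rho_{(I_k|Z)H}(i,j).$$

If we define

$$F_{ij} = -\frac{1}{4}\,\hat\delta_{ij}^T (\hat\Sigma_i + \hat\Sigma_j)^{-1} \hat\delta_{ij} \quad \text{and} \quad G_{ij} = -\frac{1}{2} \ln \frac{\left|\tfrac{1}{2}(\hat\Sigma_i + \hat\Sigma_j)\right|}{|\hat\Sigma_i|^{1/2}\,|\hat\Sigma_j|^{1/2}},$$

we have that the differential of $\rho_{(I_k|Z)H}(i,j)$ is

$$d(\rho_{(I_k|Z)H}(i,j)) = \frac{1}{m} \exp(F_{ij} + G_{ij})\,(d(F_{ij}) + d(G_{ij})),$$

from whence it follows that

$$d(\rho((I_k|Z)H)) = \frac{1}{m} \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} \exp(F_{ij} + G_{ij})\,(d(F_{ij}) + d(G_{ij})).$$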


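In order to simplify the notation, define $\Sigma_{ij} = \Sigma_i + \Sigma_j$ and
$\Delta_{ij} = (\mu_i - \mu_j)(\mu_i - \mu_j)^T$.
Let $\mathrm{tr}(\cdot)$ denote the trace of $(\cdot)$ and $|\cdot| = \det(\cdot)$. With
a bit of matrix algebra it follows that

$$F_{ij} = -\frac{1}{4}\,\mathrm{tr}\{((I_k|Z)H\Sigma_{ij}H(I_k|Z)^T)^{-1}(I_k|Z)H\Delta_{ij}H(I_k|Z)^T\}$$

and

$$G_{ij} = -\frac{1}{2}\ln|(I_k|Z)H\Sigma_{ij}H(I_k|Z)^T| + \frac{1}{4}\ln|(I_k|Z)H\Sigma_iH(I_k|Z)^T| + \frac{1}{4}\ln|(I_k|Z)H\Sigma_jH(I_k|Z)^T| + \frac{k}{2}\ln 2.$$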
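The trace and log-determinant forms of $F_{ij}$ and $G_{ij}$ can be checked in the simplest case $k = 1$, where $(I_1|Z)H$ is just the first row of $H$ and every $k \times k$ matrix collapses to a scalar. The sketch below assumes that $(I_k|Z)$ denotes the $k \times k$ identity followed by a zero block (so that it selects the first $k$ rows), and uses made-up class parameters; it compares $\frac{1}{m}\exp(F_{ij} + G_{ij})$ with the one-dimensional closed form for the projected variable.

```python
import math

def first_row_of_householder(x):
    """(I_1|Z)H for H = I - 2 x x^T / (x^T x): the first row of H."""
    s = sum(c * c for c in x)
    return [(1.0 if c == 0 else 0.0) - 2.0 * x[0] * x[c] / s for c in range(len(x))]

def quad(b, S):
    """Quadratic form b S b^T for a row vector b."""
    n = len(b)
    return sum(b[r] * S[r][c] * b[c] for r in range(n) for c in range(n))

def rho_pair_k1(x, mu_i, S_i, mu_j, S_j, m):
    """(1/m) exp(F_ij + G_ij) specialised to k = 1."""
    b = first_row_of_householder(x)
    proj = sum(bk * (a - c) for bk, a, c in zip(b, mu_i, mu_j))  # b (mu_i - mu_j)
    s_i, s_j = quad(b, S_i), quad(b, S_j)
    s_ij = s_i + s_j                       # b (Sigma_i + Sigma_j) b^T
    F = -0.25 * proj * proj / s_ij         # trace of a 1 x 1 matrix
    G = (-0.5 * math.log(s_ij) + 0.25 * math.log(s_i)
         + 0.25 * math.log(s_j) + 0.5 * math.log(2.0))  # (k/2) ln 2 with k = 1
    return math.exp(F + G) / m, proj, s_i, s_j

# Hypothetical two-class, three-channel example.
mu_i, S_i = [0.0, 1.0, -1.0], [[2.0, 0.3, 0.0], [0.3, 1.0, 0.1], [0.0, 0.1, 1.5]]
mu_j, S_j = [1.0, -0.5, 0.5], [[1.0, 0.0, 0.2], [0.0, 2.0, 0.0], [0.2, 0.0, 1.0]]
x = [0.5, -1.0, 2.0]

value, proj, s_i, s_j = rho_pair_k1(x, mu_i, S_i, mu_j, S_j, m=2)

# Direct one-dimensional closed form for Y = bX with variances s_i and s_j.
direct = 0.5 * math.sqrt(2.0 * math.sqrt(s_i * s_j) / (s_i + s_j)) \
    * math.exp(-0.25 * proj ** 2 / (s_i + s_j))

print(value, direct)
```

Agreement of the two printed numbers is only a consistency check of the algebra for $k = 1$; for $k > 1$ the same expressions require a matrix inverse and determinant.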


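We will now develop expressions for $d(F_{ij})$ and $d(G_{ij})$, $i,j = 1,\dots,m$.
According to Decell and Quirein (1973),

$$d(F_{ij}) = -\frac{1}{2}\,\mathrm{tr}\{d((I_k|Z)H)\,Q_{ij}\}$$

where $B = (I_k|Z)H$ and

$$Q_{ij} = [\Delta_{ij} - \Sigma_{ij}B^T(B\Sigma_{ij}B^T)^{-1}B\Delta_{ij}]\,B^T(B\Sigma_{ij}B^T)^{-1}.$$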

Since $H = I - 2\,xx^T/(x^Tx)$ it follows that

$$\begin{aligned}
d((I_k|Z)H) &= d\left((I_k|Z)\left(I - 2\,\frac{xx^T}{x^Tx}\right)\right) = -2(I_k|Z)\,d\left(\frac{xx^T}{x^Tx}\right) \\
&= -2(I_k|Z)\,\frac{x^Tx\,d(xx^T) - xx^T\,d(x^Tx)}{(x^Tx)^2} \\
&= \frac{-2(I_k|Z)}{(x^Tx)^2}\,\{x^Tx\,(d(x)x^T + x\,d(x)^T) - xx^T(d(x)^Tx + x^Td(x))\} \\
&= \frac{-2(I_k|Z)}{(x^Tx)^2}\,\{d(x)x^Txx^T + xx^Tx\,d(x)^T - xx^Td(x)x^T - x\,d(x)^Txx^T\} \\
&= \frac{-2(I_k|Z)}{(x^Tx)^2}\,\{(d(x)x^T - x\,d(x)^T)xx^T - xx^T(d(x)x^T - x\,d(x)^T)\}.
\end{aligned}$$
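The final form of the differential can be verified against a finite difference. The following sketch evaluates $H$ at $x$ and at $x + d(x)$ for a small, arbitrarily chosen perturbation and compares the change with $-2\{(d(x)x^T - x\,d(x)^T)xx^T - xx^T(d(x)x^T - x\,d(x)^T)\}/(x^Tx)^2$. The vectors and helper names are assumptions made for illustration; multiplying by $(I_k|Z)$ is taken here to mean keeping the first $k$ rows.

```python
def householder(x):
    """H = I - 2 x x^T / (x^T x) as a list of rows."""
    s = sum(c * c for c in x)
    n = len(x)
    return [[(1.0 if r == c else 0.0) - 2.0 * x[r] * x[c] / s
             for c in range(n)] for r in range(n)]

def matmul(a, b):
    return [[sum(a[r][t] * b[t][c] for t in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def differential_of_h(x, dx):
    """First-order change of H predicted by the closed-form differential."""
    n = len(x)
    s = sum(c * c for c in x)
    a = [[dx[r] * x[c] - x[r] * dx[c] for c in range(n)] for r in range(n)]  # d(x)x^T - x d(x)^T
    xxt = [[x[r] * x[c] for c in range(n)] for r in range(n)]
    left, right = matmul(a, xxt), matmul(xxt, a)
    return [[-2.0 * (left[r][c] - right[r][c]) / (s * s) for c in range(n)]
            for r in range(n)]

x = [1.0, -2.0, 0.5]        # hypothetical base point
dx = [1e-6, 2e-6, -1e-6]    # small perturbation
h0 = householder(x)
h1 = householder([xc + dc for xc, dc in zip(x, dx)])
predicted = differential_of_h(x, dx)

max_error = max(abs((h1[r][c] - h0[r][c]) - predicted[r][c])
                for r in range(3) for c in range(3))
largest_term = max(abs(v) for row in predicted for v in row)
k = 2
predicted_first_k_rows = predicted[:k]  # d((I_k|Z)H) under the row-selection reading

print("largest predicted entry:", largest_term)
print("largest mismatch:", max_error)
```

The mismatch should be of second order in the perturbation, that is, several orders of magnitude smaller than the predicted entries themselves.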


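Substituting the latter in the expression

$$d(F_{ij}) = -\frac{1}{2}\,\mathrm{tr}\{d((I_k|Z)H)\,Q_{ij}\}$$

and using the fact that $\mathrm{tr}(AB) = \mathrm{tr}(BA)$, we have

$$\begin{aligned}
d(F_{ij}) &= \frac{1}{(x^Tx)^2}\,\mathrm{tr}\{(I_k|Z)[(d(x)x^T - x\,d(x)^T)xx^T - xx^T(d(x)x^T - x\,d(x)^T)]\,Q_{ij}\} \\
&= \frac{1}{(x^Tx)^2}\,\mathrm{tr}\{Q_{ij}(I_k|Z)[(d(x)x^T - x\,d(x)^T)xx^T - xx^T(d(x)x^T - x\,d(x)^T)]\} \\
&= \frac{1}{(x^Tx)^2}\,\mathrm{tr}\{xx^TQ_{ij}(I_k|Z)(d(x)x^T - x\,d(x)^T) - Q_{ij}(I_k|Z)xx^T(d(x)x^T - x\,d(x)^T)\}.
\end{aligned}$$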

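With a little matrix algebra (and some patience) it follows that

$$d(F_{ij}) = \frac{1}{(x^Tx)^2}\,\mathrm{tr}\{[((I_k|Z)^TQ_{ij}^Txx^T - xx^T(I_k|Z)^TQ_{ij}^T) - (xx^TQ_{ij}(I_k|Z) - Q_{ij}(I_k|Z)xx^T)]\,x\,d(x)^T\}.$$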

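We now find an expression for $d(G_{ij})$. First, recall
(Kullback, 1968) that

$$d(\ln|B\Sigma B^T|) = 2\,\mathrm{tr}\{d(B)\,\Sigma B^T(B\Sigma B^T)^{-1}\}$$

so that

$$\begin{aligned}
d(G_{ij}) = {} & -\mathrm{tr}\{d((I_k|Z)H)\,\Sigma_{ij}H(I_k|Z)^T((I_k|Z)H\Sigma_{ij}H(I_k|Z)^T)^{-1}\} \\
& + \frac{1}{2}\,\mathrm{tr}\{d((I_k|Z)H)\,\Sigma_iH(I_k|Z)^T((I_k|Z)H\Sigma_iH(I_k|Z)^T)^{-1}\} \\
& + \frac{1}{2}\,\mathrm{tr}\{d((I_k|Z)H)\,\Sigma_jH(I_k|Z)^T((I_k|Z)H\Sigma_jH(I_k|Z)^T)^{-1}\}.
\end{aligned}$$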
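The log-determinant identity used above is easy to test numerically for a small case. The sketch below takes a hypothetical $2 \times 3$ matrix $B$, a symmetric positive definite $\Sigma$, and a small perturbation $d(B)$, and compares the actual change in $\ln|B\Sigma B^T|$ with $2\,\mathrm{tr}\{d(B)\Sigma B^T(B\Sigma B^T)^{-1}\}$. All numbers and helper names are illustrative assumptions; the $2 \times 2$ determinant and inverse are written out by hand to keep the example dependency-free.

```python
import math

def matmul(a, b):
    return [[sum(a[r][t] * b[t][c] for t in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inv2(a):
    d = det2(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def log_det_bsbt(b, sigma):
    """ln |B Sigma B^T| for a 2 x n matrix B."""
    return math.log(det2(matmul(matmul(b, sigma), transpose(b))))

# Hypothetical data: Sigma is symmetric and diagonally dominant, hence positive definite.
sigma = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]]
B = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.3]]
dB = [[1e-6, -2e-6, 0.5e-6], [0.3e-6, 1e-6, -1e-6]]

B_perturbed = [[B[r][c] + dB[r][c] for c in range(3)] for r in range(2)]
actual_change = log_det_bsbt(B_perturbed, sigma) - log_det_bsbt(B, sigma)

bsbt_inverse = inv2(matmul(matmul(B, sigma), transpose(B)))
product = matmul(matmul(matmul(dB, sigma), transpose(B)), bsbt_inverse)
predicted_change = 2.0 * (product[0][0] + product[1][1])  # 2 tr{dB Sigma B^T (B Sigma B^T)^-1}

print(actual_change, predicted_change)
```

The same identity, applied with $\Sigma_{ij}$, $\Sigma_i$, and $\Sigma_j$ in turn and weighted by the coefficients of the three log-determinants in $G_{ij}$, gives the three trace terms of $d(G_{ij})$.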




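Obviously, the summands in the expression for $d(G_{ij})$ differ
from the expression

$$d(F_{ij}) = -\frac{1}{2}\,\mathrm{tr}\{d((I_k|Z)H)\,Q_{ij}\}$$

only by multiplicative constants and the matrix $Q_{ij}$. Hence, we
may use the final expression for $d(F_{ij})$ to obtain the expression
for $d(G_{ij})$ by simply adjusting the multiplicative constants and
replacing $Q_{ij}$ (in each summand in $d(G_{ij})$) with the expressions

$$\begin{aligned}
J_{ij} &= \Sigma_{ij}H(I_k|Z)^T((I_k|Z)H\Sigma_{ij}H(I_k|Z)^T)^{-1}, \\
K_{ij} &= \Sigma_iH(I_k|Z)^T((I_k|Z)H\Sigma_iH(I_k|Z)^T)^{-1}, \\
L_{ij} &= \Sigma_jH(I_k|Z)^T((I_k|Z)H\Sigma_jH(I_k|Z)^T)^{-1}.
\end{aligned}$$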

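At this point we will simplify the notation. Let

$$\hat Q_{ij} = ((I_k|Z)^TQ_{ij}^Txx^T - xx^T(I_k|Z)^TQ_{ij}^T) - (xx^TQ_{ij}(I_k|Z) - Q_{ij}(I_k|Z)xx^T)$$

and let $\hat J_{ij}$, $\hat K_{ij}$, $\hat L_{ij}$ be similarly defined by substituting,
respectively, $J_{ij}$, $K_{ij}$, and $L_{ij}$ for $Q_{ij}$ in the expression for $\hat Q_{ij}$,
$i,j = 1,\dots,m$. It follows that

$$d(F_{ij}) = \frac{1}{(x^Tx)^2}\,\mathrm{tr}(\hat Q_{ij}\,x\,d(x)^T)$$

and

$$d(G_{ij}) = \frac{2}{(x^Tx)^2}\,\mathrm{tr}(\hat J_{ij}\,x\,d(x)^T) - \frac{1}{(x^Tx)^2}\,\mathrm{tr}(\hat K_{ij}\,x\,d(x)^T) - \frac{1}{(x^Tx)^2}\,\mathrm{tr}(\hat L_{ij}\,x\,d(x)^T).$$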


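In order that $x$ be extremal, it is sufficient that $x$ satisfy

$$G(x) \equiv \frac{1}{m\,(x^Tx)^2} \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} \exp(F_{ij} + G_{ij})\,(\hat Q_{ij} + 2\hat J_{ij} - \hat K_{ij} - \hat L_{ij})\,x = \theta.$$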


Of course, the function $G(x)$ is the gradient of
$\rho((I_k|Z)(I - 2\,xx^T/(x^Tx)))$ with respect to $x$.

With $G(x)$, we use a steepest descent technique to construct
$H_1$. The process is repeated for the construction of $H_2$ since,
given $H_1$, the problem of constructing $H_2$ is identical to that of
constructing $H_1$ provided the populations are taken to be
$N(H_1\mu_i,\ H_1\Sigma_iH_1)$, $i = 1,\dots,m$.
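The descent step can be illustrated in a toy setting. The sketch below is not the procedure used to produce the tables: it takes $k = 1$ and two bivariate classes, replaces the analytic gradient $G(x)$ by a central finite-difference approximation, and uses simple step halving so that $\rho$ never increases. The class parameters, starting vector, step sizes, and function names are all assumptions made for illustration.

```python
import math

def first_row_of_householder(x):
    """(I_1|Z)H for H = I - 2 x x^T / (x^T x)."""
    s = sum(c * c for c in x)
    return [(1.0 if c == 0 else 0.0) - 2.0 * x[0] * x[c] / s for c in range(len(x))]

def quad(b, S):
    n = len(b)
    return sum(b[r] * S[r][c] * b[c] for r in range(n) for c in range(n))

def rho(x, classes):
    """Sum over class pairs of (1/m) exp(F_ij + G_ij), specialised to k = 1."""
    m = len(classes)
    b = first_row_of_householder(x)
    total = 0.0
    for i in range(m - 1):
        for j in range(i + 1, m):
            (mu_i, S_i), (mu_j, S_j) = classes[i], classes[j]
            proj = sum(bk * (a - c) for bk, a, c in zip(b, mu_i, mu_j))
            s_i, s_j = quad(b, S_i), quad(b, S_j)
            F = -0.25 * proj * proj / (s_i + s_j)
            G = (-0.5 * math.log(s_i + s_j) + 0.25 * math.log(s_i)
                 + 0.25 * math.log(s_j) + 0.5 * math.log(2.0))
            total += math.exp(F + G) / m
    return total

def numerical_gradient(x, classes, h=1e-6):
    """Central differences; a stand-in for the analytic gradient G(x)."""
    grad = []
    for c in range(len(x)):
        up, dn = list(x), list(x)
        up[c] += h
        dn[c] -= h
        grad.append((rho(up, classes) - rho(dn, classes)) / (2.0 * h))
    return grad

def steepest_descent(x, classes, iterations=6, step=1.0):
    """Move against the gradient, halving the step until rho decreases."""
    history = [rho(x, classes)]
    for _ in range(iterations):
        g = numerical_gradient(x, classes)
        t = step
        while t > 1e-12:
            trial = [xc - t * gc for xc, gc in zip(x, g)]
            if rho(trial, classes) < history[-1]:
                x = trial
                break
            t *= 0.5
        history.append(rho(x, classes))
    return x, history

identity = [[1.0, 0.0], [0.0, 1.0]]
classes = [([0.0, 0.0], identity), ([0.5, 3.0], identity)]  # hypothetical classes
x0 = [1.0, 0.3]
x_final, history = steepest_descent(x0, classes)
g0 = numerical_gradient(x0, classes)
radial_component = sum(a * b for a, b in zip(x0, g0))  # x^T gradient

for iteration, value in enumerate(history):
    print(iteration, value)
```

Since $H$ depends only on the direction of $x$, the gradient has no component along $x$, which `radial_component` checks numerically. To build a second transformation one would, following the text, rerun the same search on the populations transformed by the first one.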

Test results are presented in the following tables for nine
twelve-channel C-1 flight line agricultural classes: soybeans,
corn, oats, red clover, alfalfa, rye, bare soil, and two types of
wheat. The Hill County data are sixteen-channel data for five
agricultural classes: winter wheat, fallow crop, barley, grass,
and stubble.


C-1 FLIGHT LINE DATA
n = 12, m = 9, k = 6, $\rho$ = .024

Iteration    $B_1$     $B_2$     $B_3$
    0        .327      .109      .134
    1        .223      .060      .034
    2        .171      .062      .033
    3        .135      .068      .032
    4        .116      .058      .031
    5        .1157     .055      .0309
    6        .1150     .054      .0303


HILL COUNTY DATA
n = 16, m = 5, k = 6, $\rho$ = .107

Iteration    $B_1$     $B_2$     $B_3$
    0        .872      .336      .299
    1        .785      .310      .287
    2        .525      .286      .232
    3        .439      .273      .227
    4        .576      .267      .226
    5        .386      .265      .224
    6        .363      .264      .223